Smart Paper Notes

Learning with an Evolving Class Ontology

Zhiqiu Lin , Deepak Pathak , Yu-Xiong Wang , Deva Ramanan , Shu Kong

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-10-10

Lifelong learners must recognize concept vocabularies that evolve over time. A common yet underexplored scenario is learning with class labels over time that refine/expand old classes. For example, humans learn to recognize ${\tt dog}$ before dog breeds. In practical settings, dataset $\textit{versioning}$ often introduces refinement to ontologies, such as autonomous vehicle benchmarks that refine a previous ${\tt vehicle}$ class into ${\tt school-bus}$ as autonomous operations expand to new cities. This paper formalizes a protocol for studying the problem of $\textit{Learning with Evolving Class Ontology}$ (LECO). LECO requires learning classifiers in distinct time periods (TPs); each TP introduces a new ontology of "fine" labels that refines old ontologies of "coarse" labels (e.g., dog breeds that refine the previous ${\tt dog}$). LECO explores such questions as whether to annotate new data or relabel the old, how to leverage coarse labels, and whether to finetune the previous TP's model or train from scratch. To answer these questions, we leverage insights from related problems such as class-incremental learning. We validate them under the LECO protocol through the lens of image classification (CIFAR and iNaturalist) and semantic segmentation (Mapillary). Our experiments lead to surprising conclusions; while the current status quo is to relabel existing datasets with new ontologies (such as COCO-to-LVIS or Mapillary1.2-to-2.0), LECO demonstrates that a far better strategy is to annotate $\textit{new}$ data with the new ontology. However, this produces an aggregate dataset with inconsistent old-vs-new labels, complicating learning. To address this challenge, we adopt methods from semi-supervised and partial-label learning. Such strategies can surprisingly be made near-optimal, approaching an "oracle" that learns on the aggregate dataset exhaustively labeled with the newest ontology.
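One way to learn from an aggregate dataset with mixed coarse and fine labels is a partial-label loss that treats a coarse label as "any of its fine children". The sketch below is a generic formulation under that assumption, not necessarily the exact loss used in the paper; the ontology `FINE_TO_COARSE` and the function names are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def coarse_label_loss(fine_logits, coarse_label, fine_to_coarse):
    # Negative log of the probability mass on fine classes whose parent
    # is the given coarse label (marginalizing over the children).
    probs = softmax(fine_logits)
    mass = sum(p for p, parent in zip(probs, fine_to_coarse) if parent == coarse_label)
    return -math.log(mass)

# Hypothetical ontology: fine classes 0 and 1 refine coarse class 0 (e.g. dog),
# fine classes 2 and 3 refine coarse class 1.
FINE_TO_COARSE = [0, 0, 1, 1]

logits = [2.0, 2.0, -1.0, -1.0]
loss_matching = coarse_label_loss(logits, 0, FINE_TO_COARSE)
loss_mismatch = coarse_label_loss(logits, 1, FINE_TO_COARSE)
```

The loss is small whenever the model puts its mass on any child of the coarse label, so old coarsely labeled data never penalizes a correct fine prediction.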

Translated by Google Translate

XMorpher: Full Transformer for Deformable Medical Image Registration via Cross Attention

Jiacheng Shi , Yuting He , Youyong Kong , Jean-Louis Coatrieux , Huazhong Shu , Guanyu Yang , Shuo Li

Categories: Computer Vision

2022-06-15

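An effective backbone network is important for deep learning-based deformable medical image registration (DMIR), because it extracts and matches features between two images to discover their mutual correspondence. However, existing deep networks focus on the single-image setting and are limited on registration tasks, which are performed on paired images. We therefore propose a novel backbone network, XMorpher, for effective corresponding feature representation in DMIR. 1) It proposes a novel full transformer architecture comprising dual parallel feature extraction networks that exchange information through cross attention, thereby discovering multi-level semantic correspondence while gradually extracting the respective features for the final registration. 2) It introduces Cross Attention Transformer (CAT) blocks to establish an attention mechanism between images, which finds correspondences automatically and prompts features to fuse efficiently in the network. 3) It constrains the attention computation to base windows and searching windows of different sizes, and thus focuses on the local transformations of deformable registration while also improving computational efficiency. Without any bells and whistles, XMorpher gives VoxelMorph a 2.8% improvement in DSC, demonstrating its effective representation of features from paired images in DMIR. We believe XMorpher has great application potential for more paired medical images. XMorpher is openly available at https://github.com/solemoon/xmorpher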
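The cross-attention exchange between a base window in one image and a larger searching window in the other can be illustrated with plain scaled dot-product attention. This is a minimal sketch that assumes no learned projections and no multi-head structure; the function name and the toy tokens are hypothetical, and the actual CAT block in the released code may differ.

```python
import math

def cross_attention(queries, keys, values):
    # queries: tokens from the base window of image A.
    # keys/values: tokens from the searching window of image B.
    d = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Each output token is a weighted mix of the other image's tokens.
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(len(values[0]))]
        )
    return outputs

base_window = [[1.0, 0.0]]                                # one token from image A
search_window = [[10.0, 0.0], [-10.0, 0.0], [0.0, 10.0]]  # three tokens from image B
attended = cross_attention(base_window, search_window, search_window)
```

Because the base token aligns with the first searching-window token, almost all attention weight goes there, which is how correspondences surface without explicit matching.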

Stochastic Planner-Actor-Critic for Unsupervised Deformable Image Registration

Ziwei Luo , Jing Hu , Xin Wang , Shu Hu , Bin Kong , Youbing Yin , Qi Song , Xi Wu , Siwei Lyu

Categories: Artificial Intelligence | Computer Vision

2021-12-14

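Large deformations of organs, caused by diverse shapes and nonlinear shape changes, pose a significant challenge for medical image registration. Traditional registration methods iteratively optimize an objective function through a specific deformation model with meticulous parameter tuning, but have limited ability to register images with large deformations. Deep learning-based methods can learn the complex mapping from input images to their respective deformation fields, but they are regression-based and prone to getting stuck in local minima, particularly when large deformations are involved. To this end, we present Stochastic Planner-Actor-Critic (SPAC), a novel reinforcement learning framework that performs step-wise registration. The key idea is to warp the moving image successively at each time step so that it finally aligns with the fixed image. Because handling high-dimensional continuous action and state spaces is challenging in the conventional reinforcement learning (RL) framework, we introduce a new concept, the "plan", into the standard actor-critic model; the plan is low-dimensional and helps the actor generate a tractable high-dimensional action. The entire framework is based on unsupervised training and operates end to end. We evaluate our method on several 2D and 3D medical image datasets, some of which contain large deformations. Our empirical results show that our method achieves consistent, significant gains and outperforms state-of-the-art methods.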

RecGURU: Adversarial Learning of Generalized User Representations for Cross-Domain Recommendation

Chenglin Li , Mingjun Zhao , Huanming Zhang , Chenyun Yu , Lei Cheng , Guoqiang Shu , Beibei Kong , Di Niu

Categories: Artificial Intelligence

2021-11-19

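Cross-domain recommendation can help alleviate the data sparsity problem in traditional sequential recommender systems. In this paper, we propose the RecGURU algorithm framework to generate a generalized user representation (GUR) that incorporates user information across domains in sequential recommendation, even when the two domains share few or no common users. We propose a self-attentive autoencoder to derive latent user representations, and a domain discriminator that aims to predict the origin domain of a generated latent representation. We propose a novel adversarial learning method to train the two modules so that user embeddings generated from different domains are unified into a single global GUR for each user. The learned GUR captures a user's overall preferences and characteristics, and can therefore be used to augment behavior data and improve recommendations in any single domain in which the user is involved. Extensive experiments were conducted on two public cross-domain recommendation datasets and on a large dataset collected from real-world applications. The results show that RecGURU boosts performance and outperforms various state-of-the-art sequential recommendation and cross-domain recommendation methods. The collected data will be released to facilitate future research.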

Persia: A Hybrid System Scaling Deep Learning Based Recommenders up to 100 Trillion Parameters

Xiangru Lian , Binhang Yuan , Xuefeng Zhu , Yulong Wang , Yongjun He , Honghuan Wu , Lei Sun , Haodong Lyu , Chengjun Liu , Xing Dong

Categories: Machine Learning

2021-11-10

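Deep learning-based models dominate the current landscape of production recommender systems. Moreover, recent years have witnessed exponential growth in model scale, from Google's 2016 model with 1 billion parameters to Facebook's latest model with 12 trillion parameters. Each jump in model capacity has brought a significant quality boost, which leads us to believe that the era of 100 trillion parameters is around the corner. However, training such models is challenging even within industrial-scale data centers. The difficulty stems from the staggering heterogeneity of the training computation: the model's embedding layer can account for more than 99.99% of the total model size and is extremely memory-intensive, while the rest of the neural network is increasingly computation-intensive. An efficient distributed training system is urgently needed to support training such huge models. In this paper, we address this challenge by carefully co-designing the optimization algorithm and the distributed system architecture. Specifically, to ensure both training efficiency and training accuracy, we design a novel hybrid training algorithm in which the embedding layer and the dense neural network are handled by different synchronization mechanisms; we then build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm. Both theoretical demonstration and empirical studies at up to 100 trillion parameters were conducted to justify the system design and implementation of Persia. We make Persia publicly available (at https://github.com/persiamml/persia) so that anyone can easily train a recommender model at the scale of 100 trillion parameters.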

Multimodal Object Detection via Probabilistic Ensembling

Yi-Ting Chen , Jinghao Shi , Zelin Ye , Christoph Mertz , Deva Ramanan , Shu Kong

Categories: Computer Vision

2021-04-07

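Object detection with multimodal inputs can improve many safety-critical systems such as autonomous vehicles (AVs). Motivated by AVs that operate both day and night, we study multimodal object detection with RGB and thermal cameras, since the latter provide much stronger object signatures under poor illumination. We explore strategies for fusing information from different modalities. Our key contribution is a probabilistic ensembling technique, ProbEn, a simple non-learned method that fuses detections from multiple modalities. We derive ProbEn from Bayes' rule and first principles that assume conditional independence across modalities. Through probabilistic marginalization, ProbEn gracefully handles missing modalities when detectors do not fire on the same object. Importantly, ProbEn also notably improves multimodal detection even when the conditional independence assumption does not hold, for example when fusing outputs from other fusion methods (both off-the-shelf and trained in-house). We validate ProbEn on two benchmarks containing aligned (KAIST) and unaligned (FLIR) multimodal images, showing that ProbEn outperforms prior work by more than 13% in relative performance.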
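Under conditional independence of the modalities given the class, Bayes' rule gives a fused posterior proportional to the product of per-modality posteriors divided by the class prior once per extra modality. The sketch below shows only this class-score fusion under that assumption; the function name and example numbers are hypothetical, and box fusion and detection matching are out of scope here.

```python
def fuse_posteriors(posteriors, prior):
    # posteriors: list of per-modality class posteriors p(y | x_i).
    # prior: class prior p(y).
    # Fused score: p(y) * prod_i [ p(y | x_i) / p(y) ], then renormalized.
    fused = []
    for c in range(len(prior)):
        score = prior[c]
        for post in posteriors:
            score *= post[c] / prior[c]
        fused.append(score)
    total = sum(fused)
    return [s / total for s in fused]

prior = [0.5, 0.5]       # hypothetical uniform prior over (object, background)
rgb = [0.8, 0.2]         # posterior from an RGB detector
thermal = [0.7, 0.3]     # posterior from a thermal detector

both = fuse_posteriors([rgb, thermal], prior)
rgb_only = fuse_posteriors([rgb], prior)   # thermal detector did not fire
```

Two agreeing detectors yield a fused score higher than either alone, and a missing modality simply drops out of the product, leaving the remaining detector's posterior.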

TAToo: Vision-based Joint Tracking of Anatomy and Tool for Skull-base Surgery

Zhaoshuo Li , Hongchao Shu , Ruixing Liang , Anna Goodridge , Manish Sahu , Francis X. Creighton , Russell H. Taylor , Mathias Unberath

Categories: Computer Vision | Artificial Intelligence

2022-12-29

Purpose: Tracking the 3D motion of the surgical tool and the patient anatomy is a fundamental requirement for computer-assisted skull-base surgery. The estimated motion can be used both for intra-operative guidance and for downstream skill analysis. Recovering such motion solely from surgical videos is desirable, as it is compliant with current clinical workflows and instrumentation. Methods: We present Tracker of Anatomy and Tool (TAToo). TAToo jointly tracks the rigid 3D motion of patient skull and surgical drill from stereo microscopic videos. TAToo estimates motion via an iterative optimization process in an end-to-end differentiable form. For robust tracking performance, TAToo adopts a probabilistic formulation and enforces geometric constraints on the object level. Results: We validate TAToo on both simulation data, where ground truth motion is available, as well as on anthropomorphic phantom data, where optical tracking provides a strong baseline. We report sub-millimeter and millimeter inter-frame tracking accuracy for skull and drill, respectively, with rotation errors below 1°. We further illustrate how TAToo may be used in a surgical navigation setting. Conclusion: We present TAToo, which simultaneously tracks the surgical tool and the patient anatomy in skull-base surgery. TAToo directly predicts the motion from surgical videos, without the need of any markers. Our results show that the performance of TAToo compares favorably to competing approaches. Future work will include fine-tuning of our depth network to reach a 1 mm clinical accuracy goal desired for surgical applications in the skull base.

Robust Consensus Clustering and its Applications for Advertising Forecasting

Deguang Kong , Miao Lu , Konstantin Shmakov , Jian Yang

Categories: Machine Learning | Artificial Intelligence

2022-12-27

Consensus clustering aggregates partitions to find a better fit by reconciling clustering results from different sources or executions. In practice, clustering tasks contain noise and outliers, which may significantly degrade performance. To address this issue, we propose a novel algorithm, robust consensus clustering, that can find common ground truth among experts' opinions and tends to be minimally affected by the bias caused by outliers. In particular, we formalize the robust consensus clustering problem as a constrained optimization problem, and then derive an effective algorithm based on the alternating direction method of multipliers (ADMM) with a rigorous convergence guarantee. Our method outperforms the baselines on benchmarks. We apply the proposed method to real-world advertising campaign segmentation and forecasting tasks, using consensus clustering results based on similarity computed via Kolmogorov-Smirnov statistics. The accurate clustering result helps build advertiser profiles for forecasting.
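The two-sample Kolmogorov-Smirnov statistic is the largest gap between two empirical CDFs. The sketch below computes it directly and, as an assumption made only for illustration, turns it into a similarity as one minus the statistic; the abstract does not say how the similarity is derived, and the function names are hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    # Largest absolute difference between the two empirical CDFs,
    # evaluated at every observed value (simple O(n^2) version).
    max_gap = 0.0
    for x in list(sample_a) + list(sample_b):
        cdf_a = sum(1 for v in sample_a if v <= x) / len(sample_a)
        cdf_b = sum(1 for v in sample_b if v <= x) / len(sample_b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap

def ks_similarity(sample_a, sample_b):
    # Assumed mapping from distance to similarity: 1 means identical distributions.
    return 1.0 - ks_statistic(sample_a, sample_b)

# Hypothetical daily spend samples for three campaigns.
campaign_a = [1.0, 2.0, 3.0]
campaign_b = [1.0, 2.0, 3.0]
campaign_c = [4.0, 5.0, 6.0]
```

A pairwise matrix of such similarities could then feed any clustering routine that accepts precomputed affinities.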

Do not Waste Money on Advertising Spend: Bid Recommendation via Concavity Changes

Deguang Kong , Konstantin Shmakov , Jian Yang

Categories: Artificial Intelligence | Machine Learning

2022-12-26

In computational advertising, a challenging problem is how to recommend a bid that lets advertisers achieve the best return on investment (ROI) under a budget constraint. This paper presents a bid recommendation scenario that discovers the concavity changes in click prediction curves. The recommended bid is derived from the turning point from significant increase (i.e. concave downward) to slow increase (convex upward). A parametric-learning-based method is applied by solving the corresponding constrained optimization problem. Empirical studies on real-world advertising scenarios clearly demonstrate performance gains in business metrics (including revenue increase, click increase and advertiser ROI increase).
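The idea of a turning point where clicks stop rising quickly can be illustrated on a sampled click curve. The sketch below is a discrete stand-in, not the parametric method of the paper: it assumes at least three sampled bids in increasing order and picks the bid where the marginal click gain drops the most. All names and numbers are hypothetical.

```python
def turning_point_bid(bids, clicks):
    # Marginal clicks gained per unit of bid on each segment of the curve.
    slopes = [
        (clicks[i + 1] - clicks[i]) / (bids[i + 1] - bids[i])
        for i in range(len(bids) - 1)
    ]
    # Change in slope at each interior bid; the most negative change
    # marks where fast growth gives way to slow growth.
    changes = [slopes[i + 1] - slopes[i] for i in range(len(slopes) - 1)]
    knee = min(range(len(changes)), key=lambda i: changes[i])
    return bids[knee + 1]

bids = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_clicks = [10.0, 30.0, 50.0, 55.0, 58.0]
recommended = turning_point_bid(bids, predicted_clicks)
```

Here the curve gains 20 clicks per unit of bid up to a bid of 3.0 and only 5 or fewer afterwards, so bidding beyond that point buys little additional traffic.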

Demystifying Advertising Campaign Bid Recommendation: A Constraint target CPA Goal Optimization

Deguang Kong , Konstantin Shmakov , Jian Yang

Categories: Artificial Intelligence | Machine Learning

2022-12-26

In cost-per-click (CPC) or cost-per-impression (CPM) advertising campaigns, advertisers always run the risk of spending the budget without getting enough conversions. Moreover, bids on advertising inventory have little connection to the propensity to reach target cost-per-acquisition (tCPA) goals. To address this problem, this paper presents a bid optimization scenario that achieves the desired tCPA goals for advertisers. In particular, we build an optimization engine that makes decisions by solving a rigorously formalized constrained optimization problem, which leverages a bid landscape model learned from rich historical auction data using non-parametric learning. The proposed model can naturally recommend a bid that meets the advertisers' expectations by making inferences over their historical auction behavior, which addresses the data challenges commonly faced in bid landscape modeling: incomplete auction logs, and uncertainty due to variation and fluctuation in bidding behavior. The bid optimization model outperforms the baseline methods on real-world campaigns, and has been applied to a wide range of scenarios for performance improvement and revenue lift.
